Performance Analysis of Cache Oblivious Algorithms in the Fresh Breeze Memory

Author

  • Joshua Foster
Abstract

The Fresh Breeze program execution model was designed for easy, reliable and massively scalable parallel performance. The model achieves these goals by combining a radical memory model with efficient fine-grain parallelism and managing both in hardware. This presents a unique opportunity to study program execution in a system whose memory behavior is not well understood. In this thesis, I studied the behavior of cache-oblivious algorithms within the Fresh Breeze model by designing and implementing a cache-oblivious matrix multiply within the Fresh Breeze programming framework, as well as a cache-naive algorithm for comparison. The algorithms were implemented in C, using the Fresh Breeze run-time libraries, and profiled on a simulated Fresh Breeze processor. I profiled both programs across a range of problem sizes, memory speeds and memory types to understand their behavior and characterize their performance accurately.

Thesis Supervisor: Jack B. Dennis, Professor Emeritus
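The contrast between the two algorithms named in the abstract can be illustrated with a minimal sketch in plain C. This is not the thesis code: it does not use the Fresh Breeze run-time libraries, it assumes ordinary row-major arrays, and the names `matmul_naive`, `matmul_oblivious`, `matmul_rec` and the cutoff `BASE_CASE` are hypothetical. The recursive version follows the standard divide-and-conquer scheme of halving the largest dimension, so no cache size appears anywhere in the code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical recursion cutoff. It only limits call overhead;
   it is not derived from any cache or memory parameter. */
#define BASE_CASE 16

/* Cache-naive reference: C += A * B for n x n row-major matrices. */
void matmul_naive(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] += sum;
        }
    }
}

/* C (m x p) += A (m x k) * B (k x p). All three are sub-blocks of
   row-major matrices that share the same row stride ld. */
static void matmul_rec(size_t m, size_t k, size_t p,
                       const double *A, const double *B, double *C,
                       size_t ld)
{
    if (m <= BASE_CASE && k <= BASE_CASE && p <= BASE_CASE) {
        for (size_t i = 0; i < m; i++)
            for (size_t l = 0; l < k; l++)
                for (size_t j = 0; j < p; j++)
                    C[i * ld + j] += A[i * ld + l] * B[l * ld + j];
        return;
    }
    if (m >= k && m >= p) {
        /* Split the rows of A and C. */
        size_t h = m / 2;
        matmul_rec(h, k, p, A, B, C, ld);
        matmul_rec(m - h, k, p, A + h * ld, B, C + h * ld, ld);
    } else if (k >= p) {
        /* Split the shared dimension: both halves accumulate into C. */
        size_t h = k / 2;
        matmul_rec(m, h, p, A, B, C, ld);
        matmul_rec(m, k - h, p, A + h, B + h * ld, C, ld);
    } else {
        /* Split the columns of B and C. */
        size_t h = p / 2;
        matmul_rec(m, k, h, A, B, C, ld);
        matmul_rec(m, k, p - h, A, B + h, C + h, ld);
    }
}

/* Cache-oblivious entry point: C += A * B for n x n row-major matrices. */
void matmul_oblivious(size_t n, const double *A, const double *B, double *C)
{
    assert(n > 0);
    matmul_rec(n, n, n, A, B, C, n);
}
```

Both functions accumulate into `C`, so the caller is expected to zero `C` before the first call. Halving the largest dimension at each step means sub-problems eventually fit in whatever level of memory is available, which is the property the abstract sets out to measure; how that plays out on Fresh Breeze memory is the subject of the thesis and is not captured by this sketch.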

Similar resources

Is Cache Oblivious DGEMM a Viable Alternative?

We present an in-depth study of various implementations of DGEMM, using both the recursive and iterative programming styles. Recursive algorithms for DGEMM are usually cache-oblivious and they automatically block DGEMM’s operands A, B, C for the memory hierarchy. Iterative algorithms for DGEMM explicitly block A, B, C for the L1 cache, higher caches and memory. Our study shows that recursive DG...

Cache-Oblivious Hash Joins

Partitioning has been used to improve the performance of the hash join in the main memory; however, cache-conscious partitioning requires the knowledge about the cache parameters, such as the capacity and unit size, of a chosen level of the CPU caches, e.g., the L2 cache. Obtaining this knowledge and subsequently tuning the algorithm may be inconvenient, and sometimes infeasible, for complex sy...

An Optimal Cache-Oblivious Priority Queue and Its Application to Graph Algorithms

We develop an optimal cache-oblivious priority queue data structure, supporting insertion, deletion, and delete-min operations in $O\left(\frac{1}{B} \log_{M/B} \frac{N}{B}\right)$ amortized memory transfers, where M and B are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hierarchy. In a cache-oblivious data structure, M and B are not used in the description of the structure. Our st...

An Experimental Comparison of Cache-oblivious and Cache-aware Programs DRAFT: DO NOT DISTRIBUTE

Cache-oblivious algorithms have been advanced as a way of circumventing some of the difficulties of optimizing applications to take advantage of the memory hierarchy of modern microprocessors. These algorithms are based on the divide-and-conquer paradigm – each division step creates sub-problems of smaller size, and when the working set of a sub-problem fits in some level of the memory hierarch...

A Comparison of Cache Aware and Cache Oblivious Static Search Trees Using Program Instrumentation

An experimental comparison of cache aware and cache oblivious static search tree algorithms is presented. Both cache aware and cache oblivious algorithms outperform classic binary search on large data sets because of their better utilization of cache memory. Cache aware algorithms with implicit pointers perform best overall, but cache oblivious algorithms do almost as well and do not have to be...

Publication date: 2012